Accelerating approximate aggregation queries with expensive predicates

Abstract

Researchers and industry analysts are increasingly interested in computing aggregation queries over large, unstructured datasets with selective predicates that are computed using expensive deep neural networks (DNNs). As these DNNs are expensive and because many applications can tolerate approximate answers, analysts are interested in accelerating these queries via approximations. Unfortunately, standard approximate query processing techniques to accelerate such queries are not applicable because they assume the result of the predicates is available ahead of time. Furthermore, recent work using cheap approximations (i.e., proxies) does not support aggregation queries with predicates. To accelerate aggregation queries with expensive predicates, we develop and analyze a query processing algorithm that leverages proxies (ABae). ABae must account for the key challenge that it may sample records that do not satisfy the predicate. To address this challenge, we first use the proxy to group records into strata so that records satisfying the predicate are ideally grouped into few strata. Given these strata, ABae uses pilot sampling and plugin estimates to sample according to the optimal allocation. We show that ABae converges at an optimal rate in a novel analysis of stratified sampling with draws that may not satisfy the predicate. We further show that ABae outperforms baselines on six real-world datasets, reducing labeling costs by up to 2.3x.
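The pipeline outlined in the abstract (stratify by proxy score, run a pilot sample, form plugin estimates, then allocate the remaining budget) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `abae_style_estimate`, the equal-size strata, sampling with replacement, the reuse of pilot samples in the final estimate, and in particular the allocation weights proportional to sqrt(p_k) * sigma_k are all assumptions made for illustration; the abstract only says that sampling follows "the optimal allocation" without stating its form.

```python
import math
import random
from statistics import mean, pstdev


def abae_style_estimate(records, proxy, oracle, value, budget,
                        num_strata=5, pilot_frac=0.3, seed=0):
    """Estimate the mean of value(r) over records where oracle(r) is True.

    Illustrative sketch only: every name and default here is hypothetical.
    Returns None if no sampled record satisfied the predicate.
    """
    if not records or budget < num_strata:
        raise ValueError("need at least one record and budget >= num_strata")
    rng = random.Random(seed)

    # 1. Stratify: sort by the cheap proxy score, cut into equal-size strata.
    ordered = sorted(records, key=proxy)
    size = math.ceil(len(ordered) / num_strata)
    strata = [ordered[i:i + size] for i in range(0, len(ordered), size)]
    k = len(strata)
    labeled = [[] for _ in strata]  # per stratum: (matched, value or None)

    def draw(s, n):
        # Each draw spends one call to the expensive predicate (oracle).
        for _ in range(n):
            r = rng.choice(strata[s])  # assumption: sample with replacement
            hit = oracle(r)
            labeled[s].append((hit, value(r) if hit else None))

    # 2. Pilot stage: spread a fraction of the budget evenly over strata.
    pilot_per = max(1, int(budget * pilot_frac) // k)
    for s in range(k):
        draw(s, pilot_per)

    # 3. Plugin estimates of the match rate p_k and the std dev sigma_k.
    weights = []
    for samples in labeled:
        hits = [v for hit, v in samples if hit]
        p_hat = len(hits) / len(samples)
        sigma_hat = pstdev(hits) if len(hits) >= 2 else 0.0
        weights.append(math.sqrt(p_hat) * sigma_hat)  # assumed allocation
    total = sum(weights)
    if total == 0:
        weights, total = [1.0] * k, float(k)  # fall back to uniform

    # 4. Second stage: allocate the remaining budget by plugin weights.
    remaining = budget - pilot_per * k
    for s in range(k):
        draw(s, int(remaining * weights[s] / total))

    # 5. Combine strata: sum_k W_k p_k mu_k / sum_k W_k p_k.
    num = den = 0.0
    for stratum, samples in zip(strata, labeled):
        hits = [v for hit, v in samples if hit]
        p_hat = len(hits) / len(samples)
        mu_hat = mean(hits) if hits else 0.0
        w = len(stratum) / len(ordered)
        num += w * p_hat * mu_hat
        den += w * p_hat
    return num / den if den > 0 else None
```

Here `proxy` stands for the cheap approximation, `oracle` for the expensive DNN predicate, and `budget` for the number of oracle calls allowed. When the proxy ranks matching records well, most matches land in a few strata, so the second stage spends nearly all of its budget where samples actually satisfy the predicate.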

Related resources

Approximate Answers for XML Queries with Range Predicates

In this paper, we tackle the difficult problem of summarizing the path/branching structure and numerical value content of an XML database. We introduce a novel, powerful XML-summarization model, termed VTreeSketches, that enables accurate approximate answers for the class of twig queries with numerical-range predicates. In a nutshell, a VTreeSketch synopsis represents an effective clustering of...

Optimal Ordering of Selections and Joins in Acyclic Queries with Expensive Predicates

The generally accepted optimization heuristics of pushing selections down does not yield optimal plans in the presence of expensive predicates. Therefore, several researchers have proposed algorithms for the optimal ordering of expensive joins and selections in a query evaluation plan. All of these algorithms have an exponential run time. For a special case, we propose a polynomial algorithm wh...

Filtering with Approximate Predicates

Approximate predicates can be used to reduce the number of comparisons made by expensive, complex predicates. For example, to check if a point is within a region (expensive predicate) we can first check if the point is within a bounding rectangle (approximate predicate). In general, approximate predicates may have false positive and false negative errors. In this paper we study the problem of s...

Temporal Aggregation with Range Predicates

A temporal aggregation query is an important but costly operation for applications that maintain time-evolving data (data warehouses, temporal databases, etc.). Due to the large volume of such data, performance improvements for temporal aggregation queries are critical. Previous approaches have aggregate predicates that involve only the time dimension. In this paper we examine techniques to com...

Spatial Queries with Two kNN Predicates

The widespread use of location-aware devices has led to countless location-based services in which a user query can be arbitrarily complex, i.e., one that embeds multiple spatial selection and join predicates. Amongst these predicates, the k-Nearest-Neighbor (kNN) predicate stands as one of the most important and widely used predicates. Unlike related research, this paper goes beyond the optimi...

Journal

Journal title: Proceedings of the VLDB Endowment

Year: 2021

ISSN: 2150-8097

DOI: https://doi.org/10.14778/3476249.3476285